Chengyu Wang

DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models

Apr 21, 2025

Understanding Attention Mechanism in Video Diffusion Models

Apr 16, 2025

Training Small Reasoning LLMs with Cognitive Preference Alignment

Apr 14, 2025

A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions

Apr 12, 2025

Building a Family of Data Augmentation Models for Low-cost LLM Fine-tuning on the Cloud

Dec 06, 2024

MOSABench: Multi-Object Sentiment Analysis Benchmark for Evaluating Multimodal Large Language Models Understanding of Complex Image

Nov 25, 2024

Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts

Nov 23, 2024

Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective

Oct 14, 2024

VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models

Oct 01, 2024

DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding

Aug 27, 2024